An Efficient K-Nearest Neighbor Search Algorithm for Distance Matrix Methods in Metric Spaces
Can Sardan
MSc.Student
Computer Engineering Department
Bilkent University
Similarity searching is an important task for management of complex,unstructured data. One of the approaches is to use metric spaces model for indexing. There are two query types used in similarity seaching; a range query that retrieves similar objects to the query for a given radius, and the k-nearest neighbor query, that is used to get the k closest objects to the query. There are two main types of index structures used for metric spaces: tree-based structures and array-based structures, also known as distance matrix methods. Distance matrix methods typically use more space and construction time to produce superior query results. There are some efficient k-nearest neighbor algorithms proposed for tree-based index structures. Some of the techniques used in them have not been adopted to array-based structures. We incorporate these ideas to further improve distance matrix methods. Our approach has two main contributions: firstly, we perform only one sequential pass over the index data; secondly, by estimating a suitable radius to the kth closest object, we minimize the extra CPU overhead for determining good candidate neighbors as early as possible.
DATE:
28 April, 2008, Monday@ 16:40
PLACE:
EA 409